Abstract: Employing recent results of Robinson (2005) we consider the asymptotic
properties of conditional-sum-of-squares (CSS) estimates of parametric
models for stationary time series with long memory. CSS estimation has been
considered as a rival to Gaussian maximum likelihood and Whittle estimation
of time series models. The latter kinds of estimate have been rigorously shown
to be asymptotically normally distributed in case of long memory. However,
CSS estimates, which should have the same asymptotic distributional properties
under similar conditions, have not received comparable treatment: the
truncation of the infinite autoregressive representation inherent in CSS
estimation has been essentially ignored in proofs of asymptotic normality. Unlike
in short memory models it is not straightforward to show the truncation has
negligible effect.



1. Introduction 

Consider a real-valued, strictly and covariance stationary time series $x_t$, $t \in \mathbb{Z}$. It
is believed that $x_t$ has a parametric autoregressive (AR) representation

(1.1)   $\sum_{j=0}^{\infty} \alpha_j(\theta_0)\, x_{t-j} = \varepsilon_t, \quad t \in \mathbb{Z}.$

Here $\varepsilon_t$ is a sequence of zero-mean, uncorrelated and homoscedastic random
variables, with variance $\sigma_0^2$, the $\alpha_j(\theta)$ are given functions with $p \times 1$ vector
argument $\theta$, $\theta_0$ is an unknown $p \times 1$ vector, and $\alpha_0(\theta) = 1$ for all $\theta$.

The range of structures $\{\alpha_j(\theta)\}$ covered by (1.1) is very broad, but of interest
to us are ones which allow $x_t$ to have long memory. Usually, these are "fractional",
where it is assumed that the function

(1.2)   $\alpha(s;\theta) = \sum_{j=0}^{\infty} \alpha_j(\theta)\, s^j,$

with complex-valued argument $s$ on the unit circle, is of form

(1.3)   $\alpha(s;\theta) = (1-s)^{\delta(\theta)}\, \alpha^*(s;\theta),$

where $\delta(\theta)$ is a scalar function of $\theta$ such that

(1.4)   $0 < \delta(\theta_0) < \tfrac{1}{2}$

and

(1.5)   $0 < |\alpha^*(s;\theta_0)| < \infty, \quad |s| = 1.$

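It follows that $x_t$ has spectral density of form

(1.6)   $f(\lambda) = \frac{\sigma_0^2}{2\pi} \left|\alpha(e^{i\lambda};\theta_0)\right|^{-2} = \frac{\sigma_0^2}{2\pi}\, \frac{|1 - e^{i\lambda}|^{-2\delta(\theta_0)}}{|\alpha^*(e^{i\lambda};\theta_0)|^{2}}.$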

The leading choice of $\alpha^*$ is a rational function of $s$, in which case $x_t$ is said to be
a fractional autoregressive integrated moving average (FARIMA) model; $\delta(\theta_0)$ is
called the memory parameter.
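As an illustration of (1.2) and (1.3), consider the simplest fractional case, assumed here only for the example: $\alpha^*(s;\theta) \equiv 1$ and $\delta(\theta) = \theta = \delta$. The coefficients are then those of the binomial expansion of $(1-s)^{\delta}$ and satisfy the recursion $\alpha_0 = 1$, $\alpha_j = \alpha_{j-1}(j-1-\delta)/j$. A minimal Python sketch (the function name `frac_ar_coeffs` is hypothetical, not from the paper):

```python
def frac_ar_coeffs(delta, m):
    """Return [alpha_0, ..., alpha_{m-1}], the coefficients of (1 - s)**delta
    expanded in powers of s (pure fractional case, alpha* = 1 assumed)."""
    coeffs = [1.0]  # alpha_0 = 1, as required after (1.1)
    for j in range(1, m):
        # Binomial-series recursion: alpha_j = alpha_{j-1} * (j - 1 - delta) / j
        coeffs.append(coeffs[-1] * (j - 1 - delta) / j)
    return coeffs
```

For example, `frac_ar_coeffs(0.3, 3)` starts with $\alpha_0 = 1$ and $\alpha_1 = -\delta = -0.3$; for $0 < \delta < 1/2$ all later coefficients are negative and decay slowly, which is the source of the truncation problem discussed below.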

Leading methods of estimation of $\theta_0$, given observations $x_1, \ldots, x_n$, are Gaussian
maximum likelihood (ML), and approximations thereto. They are "approximations"
in the sense that under similar conditions they have the same asymptotic normal
distribution as ML, and are thus asymptotically efficient under Gaussianity. At
the same time, under many departures from Gaussianity, though the efficiency
is lost, the limit normal distribution of all these estimates is unaffected. Assuming
Gaussianity, asymptotic normality of one form of approximation, a Whittle estimate
involving integration over frequency, was first established by Fox and Taqqu, and
then by Dahlhaus in case of ML estimation. Giraitis and Surgailis established
asymptotic normality for the estimate considered by Fox and Taqqu when $\varepsilon_t$ need
not be Gaussian but is independent and identically distributed with finite fourth
moment. Due to the pole in the spectral density at $\lambda = 0$ (see (1.6)), the asymptotic
normality proofs are considerably more challenging than those of Hannan for
short memory time series models, incisive though these were for such models.

An alternative estimate that has been considered in the literature is
conditional-sum-of-squares (CSS) estimation, which was previously employed by Box and
Jenkins [1] for short memory time series models. Define

(1.7)   $e_t(\theta) = \sum_{j=0}^{t-1} \alpha_j(\theta)\, x_{t-j},$

(1.8)   $s_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} e_t^2(\theta),$

and estimate $\theta_0$ by

(1.9)   $\hat{\theta}_n = \arg\min_{\theta \in \Theta} s_n(\theta),$

where $\Theta$ is a compact set.
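The definitions (1.7) to (1.9) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's procedure: the model is taken to be pure fractional with $\theta = \delta$, the minimisation over the compact set is replaced by a search over a finite grid, and all function names are hypothetical.

```python
def frac_ar_coeffs(delta, m):
    """Coefficients alpha_0..alpha_{m-1} of (1 - s)**delta (assumed model)."""
    coeffs = [1.0]
    for j in range(1, m):
        coeffs.append(coeffs[-1] * (j - 1 - delta) / j)
    return coeffs


def css_residuals(x, alpha):
    """e_t = sum_{j=0}^{t-1} alpha_j x_{t-j} for t = 1..n, as in (1.7).
    x[0] holds x_1, so x_{t-j} is x[t - 1 - j]."""
    n = len(x)
    return [sum(alpha[j] * x[t - 1 - j] for j in range(t)) for t in range(1, n + 1)]


def css_objective(x, delta):
    """s_n(delta) = n^{-1} sum_t e_t(delta)^2, as in (1.8)."""
    e = css_residuals(x, frac_ar_coeffs(delta, len(x)))
    return sum(v * v for v in e) / len(x)


def css_estimate(x, grid):
    """Minimise s_n over a finite grid standing in for the compact set in (1.9)."""
    return min(grid, key=lambda d: css_objective(x, d))
```

Each residual $e_t$ uses only $x_1, \ldots, x_t$, so the filter is truncated at a different lag for every $t$; computing all residuals directly costs $O(n^2)$ operations, which is acceptable for a sketch.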

One can motivate $\hat{\theta}_n$ by the hope that $s_n(\theta_0)$ is a good approximation to
$n^{-1} \sum_{t=1}^{n} \varepsilon_t^2$, which is itself proportional to the exponent in the density function of
independent identically distributed zero-mean normal variates. Thus one hopes that
(after centering at $\theta_0$ and norming) $\hat{\theta}_n$ has the same limit distributional
properties as the Gaussian ML and Whittle estimates mentioned previously.

Given an initial consistency proof of $\hat{\theta}_n$, a standard approach to proving
asymptotic normality entails applying the mean value theorem to $r_n(\hat{\theta}_n)$ about $\theta_0$, where

(1.10)   $r_n(\theta) = \frac{2}{n} \sum_{t=1}^{n} h_t(\theta)\, e_t(\theta),$

for

(1.11)   $h_t(\theta) = \frac{\partial e_t(\theta)}{\partial \theta}.$
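For reference, the score in (1.10) is what one obtains by differentiating (1.8) term by term. The sketch below assumes that each $\alpha_j(\theta)$ is differentiable and uses $\alpha_0(\theta) \equiv 1$:

```latex
\frac{\partial s_n(\theta)}{\partial\theta}
  = \frac{1}{n}\sum_{t=1}^{n} \frac{\partial}{\partial\theta}\, e_t^2(\theta)
  = \frac{2}{n}\sum_{t=1}^{n} h_t(\theta)\, e_t(\theta)
  = r_n(\theta),
\qquad
h_t(\theta) = \frac{\partial e_t(\theta)}{\partial\theta}
  = \sum_{j=1}^{t-1} \frac{\partial \alpha_j(\theta)}{\partial\theta}\, x_{t-j}.
```

The $j = 0$ term drops out because $\alpha_0(\theta)$ is constant, so $h_t(\theta)$ depends only on $x_1, \ldots, x_{t-1}$.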



The main part of the proof then involves establishing that $n^{1/2} r_n(\theta_0)$ converges in
distribution to a zero-mean normal vector. If the $\varepsilon_t$ are assumed to be conditionally
homoscedastic martingale differences, and conditions ensuring that $h_t(\theta)$ has finite
variance are imposed, such convergence is easily seen to hold (see e.g. [2]) for

(1.12)   $r_n^*(\theta_0) = \frac{2}{n} \sum_{t=1}^{n} h_t \varepsilon_t,$

where $h_t = h_t(\theta_0)$. However this is only useful if also

(1.13)   $r_n^*(\theta_0) - r_n(\theta_0) = o_p(n^{-1/2}),$

in other words, if the effect of replacing $e_t = e_t(\theta_0)$ by $\varepsilon_t$ is sufficiently small. Unlike
the $h_t \varepsilon_t$, the $h_t e_t$ and $h_t(e_t - \varepsilon_t)$ are not zero-mean, orthogonal random variables.
We can employ the Schwarz inequality:

(1.14)   $E\left\| r_n^*(\theta_0) - r_n(\theta_0) \right\| \le 2 n^{-1} \sum_{t=1}^{n} \left\{ E(e_t - \varepsilon_t)^2\, E\|h_t(\theta_0)\|^2 \right\}^{1/2}.$

Then if, say, it were true that $E(e_t - \varepsilon_t)^2 = O(t^{-1-2\eta})$ for some $\eta > 0$, the right
hand side of (1.14) would be $O(n^{-1/2-\eta})$, and (1.13) established. For short memory
models $E(e_t - \varepsilon_t)^2$ typically decays fast enough, indeed even exponentially. But
under quite general conditions permitting long memory (see 0]), 

(1.15)   $E(e_t - \varepsilon_t)^2 \le K t^{-1}$

only, where $K$ is an arbitrarily large generic constant, which is insufficient to
establish (1.13) using (1.14).
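To see the gap concretely, suppose (as an illustrative assumption) that $E\|h_t(\theta_0)\|^2 \le K$ uniformly in $t$. Inserting (1.15) into (1.14) then gives only

```latex
E\left\| r_n^*(\theta_0) - r_n(\theta_0) \right\|
  \le \frac{2K}{n} \sum_{t=1}^{n} t^{-1/2}
  \le \frac{2K}{n} \int_{0}^{n} u^{-1/2}\, du
  = 4K\, n^{-1/2},
```

which is $O(n^{-1/2})$ rather than the $o(n^{-1/2})$ that (1.13) requires.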

A more delicate proof of (1.13) is required, and this was given by Robinson [8].
As discussed there, this delicacy relates to that seen in the proofs of Fox and Taqqu
[3] and others for alternative estimates of $\theta_0$. Indeed, given that these estimates and
CSS should have the same limit distributional properties, it would be extraordinary
if the proof for CSS were very much easier than for the other estimates.

A central limit theorem for $\hat{\theta}_n$ is given in Section 3. Prior to that, in the following
section, we provide the almost sure convergence of $\hat{\theta}_n$ (under somewhat more
general conditions). Hannan proved this for various estimates, assuming strict
stationarity and ergodicity, which is consistent with long memory. However, he did
not cover CSS estimation.



2. Almost sure convergence 

In the present section we do not require that $x_t$ necessarily has spectral density of
form (1.6), with (1.5) holding, but simply that it is a zero-mean strictly stationary,
ergodic process with AR representation (1.1), with the sentence after (1.1) holding,
and also $\theta_0 \in \Theta$, for all $\theta \in \Theta \setminus \{\theta_0\}$

(2.1)   $\alpha(s;\theta) \ne \alpha(s;\theta_0)$

on a subset of $|s| = 1$ of positive measure, $|\alpha(s;\theta)|$ is continuous in $\theta$ for all $s : |s| = 1$, and

(2.2)   $\sum_{j=0}^{\infty} \sup_{\Theta} |\alpha_j(\theta)| < \infty.$

Condition (2.1) is a standard identifiability condition, and (2.2) is reasonable in that
long memory models (e.g. (1.6), such as FARIMAs) typically have AR representations
with summable coefficients. Note that this setup allows the spectral density
to have poles at non-zero frequencies (as in certain cyclic and seasonal models),
whereas (1.6) does not, in view of (1.5).
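The summability in (2.2) can be checked numerically in the pure fractional case $\alpha(s;\theta) = (1-s)^{\delta}$ with $0 < \delta < 1/2$, which is assumed here only for illustration. Since $\alpha_0 = 1$, $\alpha_j < 0$ for $j \ge 1$ and $\sum_{j \ge 0} \alpha_j = (1-1)^{\delta} = 0$, the absolute sum equals 2. A small Python sketch (the function name is hypothetical):

```python
def abs_partial_sum(delta, m):
    """Sum of |alpha_j| for j = 0..m-1, where alpha_j are the coefficients
    of (1 - s)**delta (pure fractional case assumed)."""
    a, total = 1.0, 1.0  # alpha_0 = 1 contributes 1 to the sum
    for j in range(1, m):
        a *= (j - 1 - delta) / j  # same binomial recursion as for alpha_j
        total += abs(a)
    return total
```

The partial sums increase towards 2, but slowly: the tail beyond lag $m$ is of order $m^{-\delta}$, so the coefficients are summable even though the process has long memory.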

Theorem 1. Under the above conditions

(2.3)   $\lim_{n \to \infty} \hat{\theta}_n = \theta_0, \quad \text{a.s.}$

Proof. Theorem 1 of Hannan and Theorem 1 of Fox and Taqqu [4] cover the
estimate

(2.4)   $\tilde{\theta}_n = \arg\min_{\Theta} \tilde{s}_n(\theta),$

where $\tilde{s}_n(\theta)$ is the objective function for the integral form of Whittle estimate, i.e.
o'%{9) of Hannan 0] or a%{9) of Fox and Taqqu [J. We can write 

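(2.5)   $\tilde{s}_n(\theta) = c_n(0)\,\beta_0(\theta) + 2 \sum_{j=1}^{n-1} c_n(j)\,\beta_j(\theta),$

where

(2.6)   $c_n(j) = \frac{1}{n} \sum_{t=1}^{n-j} x_t x_{t+j}, \quad 0 \le j \le n-1,$

(2.7)   $\beta_j(\theta) = \sum_{k=0}^{\infty} \alpha_k(\theta)\, \alpha_{k+j}(\theta).$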

From Theorem 1 of Hannan [6], and its proof, it is clear that it suffices to show
that

(2.8)   $\lim_{n \to \infty} \sup_{\Theta} |\tilde{s}_n(\theta) - s_n(\theta)| = 0, \quad \text{a.s.}$
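The size of the gap between the two objective functions can be seen exactly in a simple case. For the finite filter $\alpha(s) = 1 - \phi s$ (a short memory AR(1) example, chosen only so that every sum is finite) the Whittle-type objective exceeds the CSS objective by exactly $\phi^2 x_n^2 / n$. The Python sketch below checks this; the function names are hypothetical and the filter is passed as a finite list of coefficients.

```python
def css_objective(x, alpha):
    """s_n = n^{-1} sum_t (sum_{j<t} alpha_j x_{t-j})^2, with alpha_j = 0
    for j >= len(alpha); x[0] holds x_1."""
    n = len(x)
    total = 0.0
    for t in range(1, n + 1):
        e_t = sum(alpha[j] * x[t - 1 - j] for j in range(min(t, len(alpha))))
        total += e_t * e_t
    return total / n


def whittle_objective(x, alpha):
    """c_n(0) beta_0 + 2 sum_{j>=1} c_n(j) beta_j for a finite filter."""
    n, m = len(x), len(alpha)

    def c(j):  # sample autocovariance at lag j, no mean correction
        return sum(x[t] * x[t + j] for t in range(n - j)) / n

    def beta(j):  # sum_k alpha_k alpha_{k+j}, finite because alpha has m terms
        return sum(alpha[k] * alpha[k + j] for k in range(m - j))

    return c(0) * beta(0) + 2.0 * sum(c(j) * beta(j) for j in range(1, min(m, n)))
```

With `x = [1.0, 2.0, -1.0, 0.5]` and `alpha = [1.0, -0.5]` the difference `whittle_objective(x, alpha) - css_objective(x, alpha)` equals $0.25 \times 0.25 / 4 = 0.015625$. In the long memory case the filter is infinite and the corresponding tail sums involve every $x_t$, which is why a uniform almost sure bound needs an argument.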

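Now

(2.9)   $\tilde{s}_n(\theta) - s_n(\theta) = \frac{1}{n} \sum_{t=1}^{n} x_t^2 \sum_{k=n-t+1}^{\infty} \alpha_k^2(\theta) + \frac{2}{n} \sum_{j=1}^{n-1} \sum_{t=1}^{n-j} x_t x_{t+j} \sum_{k=n-t-j+1}^{\infty} \alpha_k(\theta)\,\alpha_{k+j}(\theta) = \sum_{i=1}^{4} a_{in}(\theta),$

where

(2.10)  $a_{1n}(\theta) = \frac{\gamma(0)}{n} \sum_{t=1}^{n} \sum_{k=n-t+1}^{\infty} \alpha_k^2(\theta),$

(2.11)  $a_{2n}(\theta) = \frac{1}{n} \sum_{t=1}^{n} \{x_t^2 - \gamma(0)\} \sum_{k=n-t+1}^{\infty} \alpha_k^2(\theta),$

(2.12)  $a_{3n}(\theta) = \frac{2}{n} \sum_{j=1}^{n-1} \gamma(j) \sum_{t=1}^{n-j} \sum_{k=n-t-j+1}^{\infty} \alpha_k(\theta)\,\alpha_{k+j}(\theta),$

(2.13)  $a_{4n}(\theta) = \frac{2}{n} \sum_{j=1}^{n-1} \sum_{t=1}^{n-j} \{x_t x_{t+j} - \gamma(j)\} \sum_{k=n-t-j+1}^{\infty} \alpha_k(\theta)\,\alpha_{k+j}(\theta),$

where

(2.14)  $\gamma(j) = \mathrm{cov}(x_0, x_j).$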
It remains to prove

(2.15)  $\lim_{n \to \infty} \sup_{\Theta} |a_{in}(\theta)| = 0 \quad \text{a.s.}, \quad i = 1, 2, 3, 4.$

As the proofs for $i = 1, 2$ are similar to but simpler than those for $i = 3, 4$, we give
only the latter. We have

(2.16) sup \asn{0)\ < -E I7(J)I <! E^^^P 

The quantity in braces is finite and since, by the Riemann-Lebesgue theorem,
existence of a spectral density implies $\lim_{j \to \infty} \gamma(j) = 0$, it follows from the Toeplitz
lemma that (2.16) $\to 0$ as $n \to \infty$. Next, by summation-by-parts

(2.17) 



^E^E {^*^*+^- - ^(■?')} E"fe(^)"'=+j(^)- 



j=i t=i fc=i 
The modulus of the first term on the right has supremum, over $\Theta$, bounded by

n 

(2.18) i^V sup |ct(j)-7(j)|sup|a„_t+i(0)| 

^l<j<n e 

using (2.2). Using (2.2) again, and Theorem 1 of Hannan [7] and the Toeplitz
lemma, it follows that (2.18) is $o(1)$ a.s. The second term in (2.17) can be similarly
handled. □



3. Asymptotic normality 



We assume now in addition that $x_t$ has spectral density (1.6), with (1.4), (1.5)
satisfied, that $\theta_0$ is an interior point of $\Theta$, that the $\varepsilon_t$ in (1.1) are independent with
zero mean, variance $\sigma_0^2$ and uniformly bounded fourth moment, that $\alpha(s;\theta)$ is twice
continuously differentiable in $\theta$, and that the matrix



(3.1) 0.= 



2^. 

is positive definite. 





logl-e*^^ 




logl-e^^f^ 


f — TT 


. -2^1og|a(e*^^o)| . 




. -2^1og|a(e»^0o) 



Theorem 2. Under the above conditions, as $n \to \infty$, $n^{1/2}(\hat{\theta}_n - \theta_0)$ converges in
distribution to a $p$-variate normal vector with zero mean and covariance matrix $\Omega^{-1}$.

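Proof. As discussed in Section 1, we have

(3.2)   $0 = r_n(\hat{\theta}_n) = r_n(\theta_0) + \tilde{R}_n(\hat{\theta}_n - \theta_0),$

where $\tilde{R}_n$ is the matrix formed by evaluating, for $i = 1, \ldots, p$, the $i$-th row of the
matrix $R_n(\theta) = (\partial^2 / \partial\theta\, \partial\theta')\, s_n(\theta)$ at $\theta = \bar{\theta}_i$, where
$\|\bar{\theta}_i - \theta_0\| \le \|\hat{\theta}_n - \theta_0\|$, $\|\cdot\|$ denoting Euclidean norm.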
Define 



i - col 



< 



7n — col 



(3.3) 

so that 

(3.4) 

and define also 
(3.5) 

(3.6) 



t-1 



Pt = ^CjXt- 



J' 



1 
t=i 

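Write $r_n(\theta_0) - r_n = r_{1n} + r_{2n} + r_{3n}$, where

(3.7)   $r_{1n} = 2 n^{-1} \sum_{t=1}^{n} (h_t - p_t)\, \varepsilon_t,$

(3.8)   $r_{2n} = 2 n^{-1} \sum_{t=1}^{n} p_t\, (e_t - \varepsilon_t),$

(3.9)   $r_{3n} = 2 n^{-1} \sum_{t=1}^{n} (h_t - p_t)(e_t - \varepsilon_t).$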
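The three-term decomposition rests on an elementary identity. As a sketch, assume that $r_n(\theta_0) = 2n^{-1} \sum_t h_t e_t$ and that the quantity subtracted from it is $2n^{-1} \sum_t p_t \varepsilon_t$, which is what (3.7) to (3.9) add up to. Then each summand satisfies

```latex
h_t e_t - p_t \varepsilon_t
  = (h_t - p_t)\,\varepsilon_t
  + p_t\,(e_t - \varepsilon_t)
  + (h_t - p_t)(e_t - \varepsilon_t).
```

Expanding the right side, the terms in $h_t \varepsilon_t$ and in $p_t e_t$ cancel, and the three terms in $p_t \varepsilon_t$ combine to $-p_t \varepsilon_t$, leaving $h_t e_t - p_t \varepsilon_t$.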

We show that $r_{in} = o_p(n^{-1/2})$, $i = 1, 2, 3$. To deal with $r_{1n}$, we may write

oo oo 

(3.10) ht- pt = -y^Q^t-J = 
[logtf



where 



(3-11) = ^C/c+j/3j-fc- 

k=0 

Since 

oo 

(3.12) E\\h,^p,f = alY,\\x,tf<K 

as noted on p. 1824 of 0], and St is independent of ht — pt, it follows that 

t=l 

Next, we can write 

oo 

(3.14) et- et = -^\jt£-j, 
where 

j 

(3.15) = y^g/c+jA-fc. 

fe=0 



Thus, from Lemma 16 of Robinson [8],
(3.16) E\\r2n\?<K 



2 ^ ^^(logn)3 



Finally, 



n 1 

i?||r3n||<-E(£;||/it-pdl'i?(e* -£*)')' 
t=i 



<-y 

~ t 
t=i 

(3.17) <^(W^ 

n 

using (3.12) and also Lemma 14 of Robinson [8]. This completes the proof that
$r_{in} = o_p(n^{-1/2})$, $i = 1, 2, 3$. The remainder of the proof is easier, and more standard,
and is omitted. □